WHEN IT SEEMS DESIRABLE TO IGNORE DATA

by

HERMAN CHERNOFF

Massachusetts Institute of Technology

ABSTRACT

An experiment designed to detect the relative motion of two astronomical
objects raised the problem of testing, against shift alternatives, the
hypothesis $H_0$ that two energy distributions are equivalent. The relevant
data consist of independent Poisson counts $X_{ij}$ with means
$\lambda_j p_{ij} T_{ij}$, where $\lambda_j$ is the intensity of radiation
from the j-th object, $p_{ij}$ is the probability that a random photon from
the j-th object has energy in a small interval centered about $e_i$, and
$T_{ij}$ is the time duration allocated to the count $X_{ij}$. The hypothesis
$H_0$ implies that $p_{i1} = p_{i2}$ for $i = 1, 2, \ldots$.

A natural test uses the statistic $\sum_i e_i(\hat p_{i2} - \hat p_{i1})$,
where the $\hat p_{ij}$ are estimates of $p_{ij}$. For intervals where the
$p_{ij}$ were anticipated to be small, the experimenter chose small $T_{ij}$
values and hence those $\hat p_{ij}$ were highly variable. Consequently,
common sense suggests that the corresponding $e_i$ and $X_{ij}$ be omitted in
the above statistic, a practice which may be regarded as sinful by
statistical dogma. This issue and others raised by the effects of small
$T_{ij}$ lead to the consideration of alternative test statistics and their
relative efficiencies as well as the design problem of selecting $T_{ij}$.

Key Words: Hypothesis testing, optimal design, Pitman efficacy, Poisson
AMS 1980 Subject Classifications: Primary 62F0>; Secondary 62G10;
A satellite based experiment, designed to detect a Doppler effect measuring
the relative motion of two neighboring astronomical objects, uses an
instrument which can count the number of photons received from either one of
these objects within a narrow energy band, for a specified duration of time
[?]. The total allocated time is divided into intervals of varying length,
each assigned to a distinct energy band and to one of the astronomical
objects. A straightforward analysis of the data raises some puzzling
questions which will be addressed here.

The above analysis compares the estimated difference in energy levels of the
two objects with its estimated standard deviation to see if there is a
significant difference. However, since the experimental design allocated
relatively little time to energy bands in which the frequency of photons was
anticipated to be small, there are some bands with very small or zero counts.
With the analysis used it seems preferable, intuitively, to ignore the data
in some of these short time, low count intervals. Moreover, with the
inclusion of one of these, the change of a single observed zero count to a
count of one would have a profound effect on the significance of the results.

Statistical dogma regards the ignoring of data as sinful, yet common sense
seems to urge us to commit this sin. The fact that the outcome of an
extremely expensive experiment involving hundreds of counts where "the action
is" should be greatly affected by the absence or presence of a single count
is puzzling. Given another opportunity to repeat this experiment, how should
the time intervals be allocated? Most of these issues are addressed in this
paper.
This situation does not qualify as a paradox, since procedures that do not
make optimal use of the data may perform better by ignoring noisy data liable
to be assigned too great a weight. A minor adjustment of the procedure can
reduce the effect of the single count, an effect which is, to some extent, a
small sample phenomenon. However, asymptotic analysis reveals that the main
issues are not of a small sample nature and that the "natural" analysis of
the data makes considerably less than full use of the available information.

The experiment is described in Section 2. The "natural" analysis and its
difficulties together with some mitigating modifications are presented in
Section 3. The discussion in Section 4 of a parametric one-sample version of
the problem contributes some insights and bounds on efficacy. Finally, in
Section 5 several alternative approaches are described and evaluated for the
original two-sample problem.

In Section 6, the efficiencies of several approaches are calculated. The
significance levels achieved using these methods are presented for four
observed data sets. Finally, there is an Appendix where detailed derivations
are presented for some of the less obvious results presented in the first
five sections.

I wish to thank Joseph Gastwirth for the benefit of some illuminating
discussion and the suggestion to use the Mann-Whitney approach discussed in
Section 5.


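2. Experiment and Notation

Let $X_i$ represent the number of photons observed during the i-th time
period of length $T_i$ from a source of intensity $\lambda$ in a narrow band
of energy, centered about $e_i$, which contributes a proportion $p_i$ to the
total intensity $\lambda$. Thus we have independent observations
$X_1, X_2, \ldots, X_m$ where

(2.1)  $\mathcal{L}(X_i) = \mathcal{P}(\lambda p_i T_i), \quad i = 1, 2, \ldots, m$

and where $\mathcal{L}(X)$ represents the distribution of $X$,
$\mathcal{P}(t)$ represents the Poisson distribution with mean $t$, and
$\sum p_i = 1$. Let

(2.2)  $\mu = \sum_{i=1}^{m} p_i e_i$

be the mean energy of photons from the source (neglecting the effect due to
variation of energy within a band).

The parameters $\lambda$, $p_i$, and $\mu$ may be estimated by

(2.3)  $\hat\lambda = \sum_{i=1}^{m} X_i / T_i$

(2.4)  $\hat p_i = X_i / (\hat\lambda T_i)$

and

(2.5)  $\hat\mu = \sum_{i=1}^{m} \hat p_i e_i$

Asymptotic analysis indicates that for large $\lambda T_i$, the distribution
of $\hat\mu$ is approximately normal with mean $\mu$ and variance

(2.6)  $\sigma^2 = \lambda^{-1} \left( \sum_{i=1}^{m} p_i T_i^{-1} \right) \sum_{i=1}^{m} w_i (e_i - \mu)^2$

where

(2.7)  $w_i = p_i T_i^{-1} \Big/ \sum_{j=1}^{m} p_j T_j^{-1}, \quad i = 1, 2, \ldots, m.$

For the two sample problem, we introduce $X_{i1}$, $X_{i2}$, $T_{i1}$,
$T_{i2}$, $\lambda_1$, $\lambda_2$, $\mu_1$, $\mu_2$, $p_{i1}$, $p_{i2}$,
$\sigma_1$, $\sigma_2$, $w_{i1}$ and $w_{i2}$. To test the hypothesis $H_0$
that there is no difference in energy distributions we apply

$Z = (\hat\mu_2 - \hat\mu_1) \big/ (\hat\sigma_1^2 + \hat\sigma_2^2)^{1/2}$

where $\hat\sigma_j^2$ is derived from $\sigma^2$ by replacing $\lambda$,
$p_i$ and $\mu$ by $\hat\lambda_j$, $\hat p_{ij}$ and $\hat\mu_j$. Under the
null hypothesis of no difference, $Z$ should be approximately normally
distributed with mean 0 and variance 1.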
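The estimates and the test statistic of this section can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the small data set are hypothetical, the weights are taken to be proportional to $\hat p_i/T_i$, and $Z$ is taken to be the difference of the two estimated means divided by the square root of the sum of the two plug-in variances.

```python
import math

def one_sample_estimates(e, T, X):
    """Plug-in estimates for one source: intensity, band proportions,
    mean energy, weights, and the estimated variance of the mean energy."""
    rates = [x / t for x, t in zip(X, T)]        # X_i / T_i estimates lambda * p_i
    lam = sum(rates)                             # intensity estimate
    p = [r / lam for r in rates]                 # band proportions, sum to one
    mu = sum(pi * ei for pi, ei in zip(p, e))    # estimated mean energy
    pt = [pi / t for pi, t in zip(p, T)]         # p_i / T_i
    s = sum(pt)
    w = [v / s for v in pt]                      # weights, sum to one
    var = s * sum(wi * (ei - mu) ** 2 for wi, ei in zip(w, e)) / lam
    return lam, p, mu, w, var

def z_statistic(e, T1, X1, T2, X2):
    """Difference of estimated mean energies over its estimated std. deviation."""
    _, _, mu1, _, v1 = one_sample_estimates(e, T1, X1)
    _, _, mu2, _, v2 = one_sample_estimates(e, T2, X2)
    return (mu2 - mu1) / math.sqrt(v1 + v2)

# Hypothetical illustrative data (not taken from the experiment).
e = [10.0, 20.0, 30.0]
T1, X1 = [100.0, 100.0, 100.0], [10, 20, 10]
T2, X2 = [100.0, 100.0, 100.0], [10, 10, 20]
print(z_statistic(e, T1, X1, T2, X2))
```

Because each weight is proportional to $X_i/T_i^2$, a band with a short observation time can dominate the variance estimate even when its count is tiny.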


I 

I 


Tha  expert want  consisted  of  four  separate  parte,  each  of  which 
contributed  to  rejecting  the  null  hypotheala  Hq.  We  present  In  Table  1, 
tha  data  from  one  of  the  parts  which  Illustrates  the  problem  rai-xd  :n 
the  introduct Ion.  The  analysis  presented  there  la  based  on  the  use  of 
only  the  data  between  e^  -  2)80  to  e^  -  2660. 

The  above  analysis  Is  the  f.rst  performed  by  the  autfor  cn  this  d  • 
set.  Here  those  values  of  1  whl--a  si.ned  Intuitively  unsafe  vert  1,  .*■« 
It  was  compared  with  an  Independent  previous  analysis  b-  t‘e  earner » 
which  turned  out  to  be  equivalent  In  formula,  but  different  !r  •*  *t  ir- 
had  included  the  case  -  2)60.  They  obtain 


Xj  -  0.00538  fcj  -  2601.?  Oj  •  *.65 

Xj  -  0.00*32  P2  -  262*..*  o 2  •  *-69 


The  substantial  difference  between  the  two  results  Is  •  :e  i  -  ?.r: 
to  the  discrepant  weights  given  to  1  •  0.  Indeed,  these  w«  s^his  h«  . -» 
Wqj  •  0.16  and  Wq2  -  0.00.  Moreoever,  If  the  count  of  «  0  u  r***la. 
by  -  1,  there  ia  another  dramatic  change  with  the  resulting  2  - 
To  a  substantial  extant  these  effects  can  be  regarded  as  sn.«lt  su-r'le 
effects  and  they  can  be  reduced  considerably  by  the  simple  device 
described  below. 

Under  the  null  hypothesis  Ho<  p^  and  p(  ,  have  a  u«ir  value  j  ^  . 
TTt«  high  variability  of  tha  contributions  ro  ctj 3  and  of  the 
p^  and  Pjj  would  ba  reduced  considerably  If  the  estimates  uf  varla-e 
Oj2  and  flj2  used  the  simple  pooled  estimate  of  p^. 


(3.1)  $\hat p_{i3} = \dfrac{\hat\lambda_1 \hat p_{i1} + \hat\lambda_2 \hat p_{i2}}{\hat\lambda_1 + \hat\lambda_2}$

Another possibility would be to use the normalized pooled estimates

(3.2)  $\hat p_{i4} = q_i \Big/ \sum_j q_j$

where

(3.3)  $q_i = \dfrac{\hat\lambda_1 T_{i1} \hat p_{i1} + \hat\lambda_2 T_{i2} \hat p_{i2}}{\hat\lambda_1 T_{i1} + \hat\lambda_2 T_{i2}}$

In Table 2 we summarize the results where $e = 2560$, i.e. $i = 0$, is
included and excluded for each of the approaches and for both $X_{02} = 0$
and $X_{02} = 1$ when $i = 0$ is included.

TABLE 1. Data and Estimates of Parameters and Weights
Based on Data for e Ranging from 2580 to 2660
Electron Volts, and a Unit of T_ij is 0.32 Seconds

  i    e_i     T_i1    X_i1     T_i2    X_i2    p_i1   p_i2   w_i1   w_i2

       2540   3803.4     3        3.4     0
  0    2560   9017.0     8     1012.4     0
  1    2580   94?9.8    12     5831.1     4     0.28   0.16   0.26   0.23
  2    2600   9281.0    13     9606.4     4     0.31   0.10   0.29   0.09
  3    2620   9132.5     6     9518.8    13     0.15   0.32   0.14   0.28
  4    2640   8763.0     7     9281.2     9     0.18   0.22   0.18   0.21
  5    2660   5425.4     2     9082.7     8     0.08   0.20   0.13   0.19
       2680      0.0     0     8402.2     3
       2700      0.0     0     5177.2     3
       2720      0.0     0     9?30.8     0

$\hat\lambda_1 = 0.00449 \quad \hat\mu_1 = 2609.3 \quad \hat\sigma_1 = 4.43 \quad Z = 2.33$
$\hat\lambda_2 = 0.00432 \quad \hat\mu_2 = 2624.4 \quad \hat\sigma_2 = 4.68$
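The two pooling prescriptions can be sketched as follows. This is a sketch under stated assumptions rather than the author's code: the function names and the data are hypothetical, the simple pooled estimate is taken to be the intensity-weighted average of the two sets of band proportions, and the normalized pooled estimate is taken to be the pooled count divided by the pooled expected exposure, rescaled to sum to one.

```python
def pooled_estimates(T1, X1, T2, X2):
    """Return (simple pooled, normalized pooled) estimates of the common p_i."""
    r1 = [x / t for x, t in zip(X1, T1)]   # X_i1 / T_i1 = lam1 * p_i1
    r2 = [x / t for x, t in zip(X2, T2)]   # X_i2 / T_i2 = lam2 * p_i2
    lam1, lam2 = sum(r1), sum(r2)
    # Simple pooled: (lam1 * p_i1 + lam2 * p_i2) / (lam1 + lam2).
    simple = [(a + b) / (lam1 + lam2) for a, b in zip(r1, r2)]
    # Normalized pooled: q_i = (X_i1 + X_i2) / (lam1 * T_i1 + lam2 * T_i2),
    # which equals the exposure-weighted average of p_i1 and p_i2.
    q = [(x1 + x2) / (lam1 * t1 + lam2 * t2)
         for x1, x2, t1, t2 in zip(X1, X2, T1, T2)]
    total = sum(q)
    normalized = [qi / total for qi in q]
    return simple, normalized

def plug_in_variance(e, T, lam, mu, p):
    """Variance of the estimated mean energy with p supplied by the caller."""
    return sum(pi * (ei - mu) ** 2 / t for pi, ei, t in zip(p, e, T)) / lam

# Hypothetical illustrative data with unequal observation times.
e = [10.0, 20.0, 30.0]
T1, X1 = [100.0, 100.0, 100.0], [10, 20, 10]
T2, X2 = [50.0, 100.0, 200.0], [5, 10, 40]
simple, normalized = pooled_estimates(T1, X1, T2, X2)
print(simple, normalized)
print(plug_in_variance(e, T1, 0.4, 20.0, simple))
```

The two estimates coincide when both samples use the same time allocation and differ otherwise, as in the usage example.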

The pooling prescriptions ameliorate considerably some of the problems we
observed, but they fail to address the philosophical question of what right
we have to omit data. It is clear from Equation (2.6) that it is unwise to
include $e_i$ values for which the $T_{ij}$ are relatively small even if
these are absolutely large. The problem is not merely a small sample problem
for which our prescription of pooling or some other expedient would be
adequate. On the other hand, as indicated in the Introduction, it hardly
qualifies to be called a paradox, for there never was any claim of
optimality, asymptotic or otherwise, for the procedure used. When a
suboptimal procedure is used it is not surprising to find that suppressing
data, which may be given too much weight by the procedure, may be desirable.

TABLE 2.

                            $\hat\mu_1$   $\hat\sigma_1$   $\hat\mu_2$   $\hat\sigma_2$     Z

1. Unpooled
   2560 excluded               2609.3         4.43          2624.4         4.68         2.33
   2560 included               2601.2         4.65          2624.4         4.68         3.51
   2560 included, X_02 = 1     2601.2         4.65          2612.4        10.52         0.97

2. Simple Pooled
   2560 excluded               2609.3         5.03          2624.4         5.12         2.09
   2560 included               2601.2         5.25          2624.4        10.51         1.97
   2560 included, X_02 = 1     2601.2         5.32          2612.4        10.23         0.97

3. Normalized Pooled
   2560 excluded               2609.3         5.16          2624.4         5.19         2.06
   2560 included               2601.2         5.41          2624.4        12.45         1.71
   2560 included, X_02 = 1     2601.2         5.39          2612.4         9.90         0.99
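A rough worked calculation shows why a single count in a short interval can dominate. It assumes that the weight of band $i$ is proportional to $\hat p_i/T_i$, hence to $X_i/T_i^2$, and it reads the second-sample times from Table 1, taking $T_{02} \approx 1012$; the figures are approximate.

```latex
% Second sample, e = 2560 included, with X_{02} = 0 replaced by 1.
\[
\frac{X_{02}}{T_{02}^{2}} \approx \frac{1}{1012^{2}} \approx 9.8\times 10^{-7},
\qquad
\sum_{i=1}^{5}\frac{X_{i2}}{T_{i2}^{2}}
  \approx (1.18 + 0.43 + 1.43 + 1.04 + 0.97)\times 10^{-7}
  \approx 5.1\times 10^{-7},
\]
\[
w_{02} \approx \frac{9.8}{9.8 + 5.1} \approx 0.66 .
\]
```

Under these assumptions one photon observed in the short interval would carry roughly two thirds of the total weight, more than the 38 photons in the five long intervals combined, which is in line with the jump of $\hat\sigma_2$ from about 4.7 to about 10.5 in Table 2.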
4. A One-Sample Version of the Problem

While we may not be dealing with a paradox, the theoretical problem still
remains of how one should properly analyze the data so that the sin of
ignoring data can be avoided. Of course, this does not vitiate the previous
analysis. It provides an opportunity to compare that rather straightforward
analysis with some sort of optimal alternative to see if there has been a
serious loss of efficiency. A further consequence of the solution of the
theoretical problem is that it casts light on the problem of the optimal
choice of the time intervals $T_{ij}$.

Let us assume that the $p_{ij}$ are of a prescribed functional form, uniquely
determined except for a one dimensional parameter $\theta$ representing the
velocity of the astronomical object studied. Then we have a problem involving
4 parameters $\lambda_1$, $\lambda_2$, $\theta_1$, $\theta_2$ where
$\theta_j$ determines the proportions $p_{ij} = p_i(\theta_j)$, $j = 1, 2$.
For asymptotic considerations we are concerned mainly with the estimate of
$\theta_2 - \theta_1$.

We present a more or less standard analysis deriving score functions and
information matrices using the likelihood function. For simplicity let us
first consider a one-sample version of our problem with parameters
$\lambda$, $\theta$ and data $T_i$, $X_i$ where $T = \sum T_i$ is assumed to
be large. The likelihood function is

(4.1)  $L = \prod_{i=1}^{m} e^{-\lambda T_i p_i(\theta)} \, [\lambda T_i p_i(\theta)]^{X_i} / X_i!$

and

$\log L = -\lambda \sum_{i=1}^{m} T_i p_i(\theta) + \sum_{i=1}^{m} X_i \log[\lambda T_i p_i(\theta)] - \sum_{i=1}^{m} \log X_i!$
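The log likelihood for independent Poisson counts with means $\lambda T_i p_i(\theta)$ is easy to evaluate numerically. The sketch below uses hypothetical names and data and treats the proportions $p_i(\theta)$ as given; setting the derivative with respect to $\lambda$ to zero gives $\hat\lambda(\theta) = \sum X_i / \sum T_i p_i(\theta)$, which the usage example checks against nearby values.

```python
import math

def log_likelihood(lam, p, T, X):
    """log L for independent Poisson counts X_i with means lam * T_i * p_i."""
    total = 0.0
    for pi, ti, xi in zip(p, T, X):
        mean = lam * ti * pi
        # lgamma(x + 1) equals log(x!) for non-negative integer x.
        total += -mean + xi * math.log(mean) - math.lgamma(xi + 1)
    return total

def lambda_mle(p, T, X):
    """Maximizer of log L over lam when the proportions p_i are held fixed."""
    return sum(X) / sum(ti * pi for ti, pi in zip(T, p))

# Hypothetical illustrative values.
p = [0.2, 0.5, 0.3]
T = [100.0, 200.0, 50.0]
X = [3, 12, 1]
lam_hat = lambda_mle(p, T, X)
print(lam_hat, log_likelihood(lam_hat, p, T, X))
```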


random  variable. 


Still another approach is to use a nonparametric test such as the
Mann-Whitney test, based on an estimate of $P(E_1 < E_2)$, where $E_1$ and
$E_2$ are the energies of independent random photons from the two
astronomical objects. For this we use


(S.8> 


*ti  “'J* '  ' 


It  fan'r  exactly  clear  whet  the  neturel 


general  1  ration  of  nonp.ra- 


•*  ">  'M.  context.  On,  propoa.l  „  u  , 


A  *  '*•)*  "  *  ,k~  ■J'— 'trie  matrix.  The  atatf.tlc  ire  „  ,  Hp.cl<1 

J‘'  "  S  *‘J  ■*“''  )  a(J  •  ,f  ,  ,  j.  For  tearing 

‘I*  ""  *'4f' ‘  tlv*  Iv  equivalent  statutlc  to  1. 


ViV  .-.■■uf.Tr 


To evaluate the efficacies of these statistics we calculate the means and
variances of their asymptotic distributions.

tv T  V, 


■’ll’ 

Vu' 


(5.11) 


for  vr , 


V  ' 


’  JV»12  "  PM> 


*  *«**  ,[/.l*I J»JI  *  Vl  -  p|;, 

<s.n> 

’  vv  I  Lv,t  *JW  *  «{•,/, i 

where $\delta_6$ and $\delta_7^*$ are described below.

Ut  "  th"  P‘J  ’  W  I.  •  tranautlon  parawt 

l0**  ^  6°’  n,*B  Ca°  b*  ■PPronlawt *d  bv  erpresM .  ,,f 

For  the  etatratlre  .■  . 

the  efficacy  for  testing  H  :  e  •  {•  t.  »«,.  .  -l  ,  6 

*  o  !  **  *•  *!'*»  by  T  'fa’M  /(a-6.,,. 

°PU»1  choice  of  a  would  be  B1* 

_  •»*-  the  vorrexp-ndln,;  ;  ,  , 

If  »  were  -=n.,n,u,ar.  -owe.,,  ,  I,  am*.,.,  |hf„  ,  _ 

which  hoth  .,-0  and  .  0.  then  ,„d 


of  t>- 


the  optioal  a  and  efficacy.  The  atatlatic  W*  sust  be  treated  differently. 
There  p Is  appro* lasted  by  an  expression  of  the  for* 
<'V<,l*',l42^qiPi<eo),Si  *nd  °91  by  ***** 


<1  * 


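The efficacy of $W_9^*$ is
$T^{-1} \left[ \sum q_i p_i(\theta_0) \delta_i \right]^2 \Big/ \left[ \sum q_i^2 p_i(\theta_0) / u_i \right]$
and can be maximized easily by the Method of Lagrange Multipliers when we
note that the condition that $A = (a_{ij})$ be skew symmetric is equivalent
to the condition $\sum p_i(\theta_0) q_i = 0$. This gives

$q_i = \nu_1 \delta_i u_i + \nu_2 u_i$

where $\nu_1$ and $\nu_2$ are Lagrange Multipliers. More details on these
results follow.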

Tor  we  have  4^  and  B  ,  given  by 


>lot  p,(9  ) 

*16  '  *»,«.> - S" - «, 


and 
(3. Id) 

wtiere 


odd’  -  dp'<9  >  -  £(fl  )d’ 


with  d  -  P4 (&„)«..  D  ■  dlagfd.)  and  a  •  Jp. (9  )u”'.  hole  that  b* 
liol  l  O  1  o  1  6 

and  gjv.  -  0  where  v  -  u~*. 
o  tl  lt>  1 

For  W?  we  have  _$,*  and  B ,*  given  by 


(3.20) 

and 

(3.21) 


»log  P.t«  ) 


V  •  h'V  —TT 


-  0  •  •„(!<»>[<-  I  •  - 


with  D  •  dlag(d  ),  d.  -  p.(6  )u  ah'!  •  •  Id,  Is  as  above.  K-Je  j 

1  1  101  01 

we  note  that  8*v^  •  0  and  6*v^  •  0  for  •  1. 

For  we  have  6^^  ■  -Slog  P1(?o>/3?  and  u  Is  as  above.  Apr l 

the  Method  of  Lagrange  Multipliers,  we  derive 


I 


i 


J 
and the maximum value of the efficacy is
$T^{-1} \{ E(\delta^2 u) - [E(\delta u)]^2 / E(u) \}$, where $E(g)$ denotes
$\sum p_i(\theta_0) g_i$.

For each of the statistics $W_6$, $W_7^*$ and $W_9^*$, the best choice of
coefficients $a_i$ depends on knowledge of $p_i(\theta)$. While an
incorrectly specified model for $p_i(\theta)$ will lead to a suboptimal
choice of coefficients, the resulting statistic can still be used to test the
hypothesis $H_0$ under which the mean is zero and the variance may be
estimated by using the pooled estimates of $p_{i1} = p_{i2}$.

In the meantime, one may compute the relative efficiencies of these
statistics under the assumption that $p_i(\theta)$ is of a special form. In
particular, for our special problem, the forms of interest are those where
the underlying density is normal, or a mixture of two or three normals, and
$p_i$ is the integral of this density over the range
$e_i - h/2 \le e \le e_i + h/2$, where $e_i = e_0 + ih$.

Several problems have been neglected and remain somewhat unresolved. These
are due to two facts. First, the energy bands of width $h$ are not very
narrow, and second, there is a truncation effect since there may be
substantial radiation outside of the bands to which time has been allocated.
Thus, while deviations from $H_0$ may be detected by observing, for large
$T$, that the statistic differs from zero, the extent of the translation
cannot easily be determined without assuming a functional form for the
density of the energy. Nor is it possible to check that the failure of $H_0$
corresponds to a translation of the energy distribution.

This problem is relatively minor if there are many narrow energy bands and if
there is little density in those bands near the boundary of the region to
which substantial time has been allocated. Otherwise, even
$\sum \hat p_i e_i$ is likely to be a poor estimate of the mean energy. If
the bands are wide, $e_i$ may not accurately represent the average energy in
the band. If there is substantial density at the boundaries of the region
studied, the probabilities being estimated are really conditional
probabilities given that the energy is in the prescribed region. Then a
translation shift in the energy distribution does not correspond exactly to a
translation shift in the conditional distribution.

If these difficulties were negligible, one could estimate the translation
parameter by seeing how much of a shift $h$ from $e_i$ to $e_i + h$ would be
required for one of the resulting statistics $W_6$, $W_7^*$ to cross zero.

Another question that has not been carefully examined is the relevance of
Pitman efficiency, a measure designed for dealing with small shifts, if the
shift is substantial compared to the standard deviations of the components of
the density corresponding to the modes or spectral lines studied.
6. Tables of Efficacies and Significance

The experiment yielded four data sets, the fourth of which appears in
Table 1. The others are presented in Table 3. These data suggest models for
$p_i(\theta)$ of the form

$p_i(\theta) = \int_{i-1/2}^{i+1/2} f(x - \theta)\,dx$

where

$f(x) = \sum_j \alpha_j \sigma_j^{-1} \phi((x - \mu_j)/\sigma_j)$

and $\phi$ is the standard normal density. These models were used for our
theoretical evaluations. This method of choosing models, loosely fitting the
data, may bias our results to yield apparent efficiencies somewhat larger
than deserved for methods whose coefficients are "tuned" to optimize with
respect to these "fitted" models. The parameters of the models are presented
in Table 4. The efficacies of various methods are presented in Table 5. These
methods are applied to the data sets, and in Table 6 the corresponding $Z$
values are presented. Since the significance levels or P values are given
approximately by

$P = \Phi(-Z)$

where $\Phi$ is the standard normal c.d.f., these levels are not presented
explicitly. More details follow.
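The band probabilities of a shifted normal mixture and the approximate P value can be computed with the standard library alone. This is a minimal sketch with hypothetical function names; it assumes energies are measured in units of the band width, so that band $i$ is the interval $(i - 1/2, i + 1/2)$, and that the mixture weights sum to one.

```python
import math

def normal_cdf(x):
    """Standard normal c.d.f. written with the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def band_probability(i, theta, components):
    """Mass that the mixture density, shifted by theta, puts on band i.

    components is a list of (mean, std_dev, weight) triples."""
    lo = i - 0.5 - theta
    hi = i + 0.5 - theta
    return sum(a * (normal_cdf((hi - m) / s) - normal_cdf((lo - m) / s))
               for m, s, a in components)

def p_value(z):
    """One-sided significance level, Phi(-Z)."""
    return normal_cdf(-z)

# Hypothetical single-component model for illustration.
model = [(0.0, 1.0, 1.0)]
probs = [band_probability(i, 0.0, model) for i in range(-10, 11)]
print(sum(probs), p_value(1.96))
```

Summing the band probabilities over a finite set of bands shows directly how much mass a truncated design leaves out.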
Table 4. Parameters of Three Models for $p_i(\theta)$

            lambda_1   lambda_2    n     j    mu_j   sigma_j   alpha_j

Model 1      0.0165     0.0135     1     1     3.5     1.6      0.15
                                         2     7.5     1.6      0.36
                                         3    11.0     1.6      0.49

Model 2      0.0263     0.0275     ?     1     8.0     2.5      1.0

Model 3      0.0045     0.0043     ?     1     4.0     2.0      1.0

The efficacies presented in Table 5 are for the statistics testing
$H_0$: $p_{i1} = p_{i2}$, $i_1 \le i \le i_2$, against local shift
alternatives to $H_0^*$: the model applies with $\theta_1 = \theta_2$
(approximately 0).

1. $E_0$ is the efficacy of the optimal parametric test.

2. $E_{06}$, $E_{07}$ and $E_{09}$ are the efficacies of $W_6$, $W_7^*$ and
$W_9^*$ using the optimal coefficients.

3. $E_{17}$ is the efficacy of the "natural test" based on the statistic
$W = \sum e_i(\hat p_{i2} - \hat p_{i1})$ or its equivalent
$W_{17} = \sum i(\hat p_{i2} - \hat p_{i1})$ $(e_i = e_0 + hi)$.

4. $E_8$ is the efficacy of $W_8$, the Mann-Whitney version of $W_9$.

For each model, the data set is relevant only in the values of $T_{i1}$ and
$T_{i2}$ (design parameters) used, and the observed counts are not relevant.

For each model and data set combination, we consider several alternative
subsets of the available counts for inclusion in the analysis. Thus if we
take $i_1 \le i \le i_2$, $T = \sum_{i=i_1}^{i_2} (T_{i1} + T_{i2})$ and $TE$
is of interest as well as $E$.

Computations show that $E_{06} = E_{07} = E_{09}$ seems to be true in
general. This is not very surprising and should not be too difficult to
establish. In particular the equality of $E_{06}$ and $E_{07}$ was
anticipated.

Clearly we should have $E_0 \ge E_{06}$. Possibly because of the loose
fitting of the models to the observed data, $E_{06}$ is very close to $E_0$.
On the other hand $W_{17}$ is sometimes poor and sometimes very sensitive to
the choice of data to be included in the analysis. Generally $W_8$ does
better than $W_{17}$ and is less sensitive to the choice of data to be
included.

In Table 6 we present the values of $Z = W/\hat\sigma_W$ corresponding to
various estimators applied to the actual data sets. For each estimator two
$Z$ values
Table 5. Efficacies of Tests for $H_0$: $p_{i1} = p_{i2}$ ($i_1 \le i \le i_2$)

Entries are $10^5 \times E$ and $10^{-5} \times T$.

 d   m   i_1   i_2     E_0    E_06    E_17    E_8      T     TE_06   TE_17

 1   1    2    13     1.67    1.55    1.24    1.32    1.31    2.03    1.62
 1   1    1    14     1.79    1.55    0.87    1.05    1.36    2.11    1.18
 1   1    3    12     1.2?    1.14    0.78    0.68    1.17    1.34    0.91
 2   2    2    14     7.01    6.80    6.53    6.77    0.81    5.51    5.29
 2   2    1    15     6.68    6.40    5.22    6.0?    0.87    5.57    4.54
 2   2    1    14     6.85    6.60    5.60    6.39    0.84    5.54    4.87
 2   2    3    13     7.31    7.?0    6.99    7.08    0.74    5.25    5.17
 3   1    3    13     2.1?    2.09    1.39    1.59    2.38    4.97    3.31
 3   1    3    14     2.33    2.23    1.55    1.75    2.46    5.49    3.81
 3   1    4    12     1.49    1.48    1.01    1.13    2.11    3.12    2.13
 4   3    2     7     1.9?    1.67    1.12    1.11    0.95    1.59    1.06
 4   3    1     7      ?       ?      0.01    0.01    0.99    1.47    0.01
 4   3    3     6     1.73    1.71    1.69    1.68    0.71    1.21    1.20
 4   3    3     7     1.92    1.87    1.84    1.84    0.85    1.59    1.56

d = data set number
m = model number

Table 6. Significance Levels and Estimated Shifts

Z values corresponding to various test statistics and data sets.
$P = \Phi(-Z)$ where $\Phi$ is the standard normal c.d.f. Data set 5 is data
set 4 with $X_{02}$ replaced by 1.

S values are estimated shifts in energy distribution normalized by dividing
by energy.

 d   m   i_1   i_2    Z_06   Z_07   Z_17   Z_8    Z_91   Z_93    S_07      S_17

 1   1    2    13     2.6    2.6    2.?    ?.4    3.0    2.9     0.0048    0.00?9
 1   1    1    14     2.8    2.8    1.8    2.2    2.8    3.3     0.00??    0.0?66
 1   1    3    12     2.1    2.3    1.8    1.9    3.1    2.2     ?         ?
 2   2    2    14     1.4    1.5    1.0    1.4    1.1    0.5     0.00?9    ?
 2   2    1    15     1.5    1.5    1.1    1.4    1.5    0.5     0.0040    ?
 2   2    1    14     1.5    1.5    1.?    1.?    1.1    0.?     0.002?    ?
 2   2    3    13     1.9    2.9    2.8    ?.0    1.8    1.9     ?         ?
 3   1    3    13     2.9    2.9    2.0    2.3    2.6    3.1     0.009?    ?
 3   1    3    14     2.7    2.7    1.8    ?.1    2.0    3.1     0.0100    ?
 3   1    4    12     3.8    3.6    3.4    3.5    3.6    ?       ?         ?
 4   3    2     7     2.6    2.6    2.?    2.?    2.8    1.9     ?         ?
 4   3    1     7     1.3    1.3    0.2    0.2    2.7    1.9     ?         ?
 4   3    3     6     1.9    1.9    1.8    1.3    ?      1.8     0.02??    ?
 4   3    3     7     2.3    2.3    2.2    ?.3    2.3    2.7     ?         ?
 5   3    2     7     2.0    2.0    1.0    1.0    2.1    1.5     0.02??    ?
 5   3    1     7     1.3    1.4    0.2    0.2    2.0    1.6     0.015?    ?

d = data set number
m = model number
were calculated corresponding to the use, respectively, of $\hat p_{i3}$ and
$\hat p_{i4}$ in the estimation of the standard deviation of the $W$
statistic. Since these $Z$ values were almost always very close to one
another, only those corresponding to $\hat p_{i3}$ are presented. For $W_9$,
three versions of $Z$ were calculated corresponding to three related
statistics. The first uses the coefficients $a_{ij0}$ selected so that


where  Si0  are  the  optical  values  of  ql  according  to  the  appropriate  node l . 
Since  the  c'-vUe  of  the  a^Q  is  not  unique,  they  were  selected  so  that 
*1  i ♦!  0  "  *1  1»’  0  *  ■ • •  ■  The  statistic  V,93  uses  similar  coefflcents 
subject  to 


|V|1PJ»  *  '<10 

and  la  derived  similarly.  We  present  2  values  for  and  W*^j, 

l.e.  Z*^  and  Z*^.  Note  that  Table  6  Involves  the  model  and  1^  and  1^ 
since  these  determine  the  coefficients  of  W^,  W^,  «nd  the  Kj  statistics. 

Finally, estimates of the mean shift $\theta_2 - \theta_1$ were calculated.
To be more specific, $\hat\mu_2 - \hat\mu_1 = hW_{17}$ estimates
$\mu_2 - \mu_1 = h\sum i(p_{i2} - p_{i1})$, the shift in the distribution of
energy. The relative velocity of the two astronomical objects is
approximately proportional to $(\mu_2 - \mu_1)/e_0$, which is estimated by
$S_{17} = hW_{17}/e_0$. To estimate this same parameter using $W_{07}$ the
following coarse technique was used. Compute


liiJL 


to represent the value of $W_{07}$ if $\theta_1$ and $\theta_2$ were each
shifted by 1/2 in opposite directions. We approximate $s$, the number of
intervals by which $\theta_2 - \theta_1$ must be shifted to make the
"shifted" $W_{07}$ statistic zero, by

“Jj  *  •(k0*  •  “2j>  -  0 

and let the corresponding velocity be estimated by the normalized shift

*07  -  h'f„  -  -  •$;> 

A coarse estimate of the coefficient of variation of $S_{07}$ ($S_{17}$) is
$1/Z_{07}$ ($1/Z_{17}$). These estimates are unreliable because of the width
of the cells, the truncation effects, and the likelihood that the true shifts
are not small.
The  estimates  S*.  and  are  quite  v^rlsb'.e  with  $*.  genera*’./  s.  t  ii 

larger  than  S*  . . 

The careful reader may note that there is a nontrivial difference between the
$Z$ values in Tables 2 and 6 corresponding to data set 4. This difference is
due to the fact that in Table 2, Equation (2.6) was applied with
$\hat\mu_1 = \sum e_i \hat p_{i1}$ substituting for $\mu$ when $j = 1$ and
$\hat\mu_2 = \sum e_i \hat p_{i2}$ when $j = 2$. On the other hand, for the
calculation in Table 6, a common estimate was substituted for $\mu$ in both
cases.
A.2 Efficacy of $\hat\theta - \theta_0$

(a) $\hat\theta - \theta_0$. For $\theta$ close to $\theta_0$,

(A2.1)  $\mathcal{L}(T^{1/2}(\hat\theta - \theta)) \approx N(0, T J^{22}(\lambda, \theta_0))$

Hence the efficacy of $\hat\theta - \theta_0$ (per unit time) is


(A24) 


<A2. 2) 


»  I  >1 

I<».»0)  •  !<».«>  -  .  ,  j  ♦  ■»  (T  > 

1  f »  -  »  1  L 

v  •  j*l(»,«)  »<’.e)  -  (.  ,  ♦  o  (T  ) 

lV  e  I  p 

■^iT’tUj  -  (»  -  *o>)>  =  «<O.Tj:'.’j'2)  -  K(0.IJ::) 


(A!.!) 


(tj22<»,«0)]' 1 


(dividing the square of the derivative of the mean by the variance).


(b>  w2. 

Let  •  -  (»,«)'  with  •  -  •  *  0(TJ’) 

<A2.J>  H  -  ■»‘1<i.»0>I<*,V 

where  »  -  »  -  O.CT'’’).  T'\i  end  Tj"1  »t«  »„«>.  •"* 
P  V 

i|l”,,Y(».e)l  =  UfO.T"1!).  lhen 


(A2.4)  *  °p<T  '  3  1<2>,o>  *  VT  !) 

»<».»>  fi  -  1  L 

(A2.5)  V(».80)  -  !«.«>  ♦  — a}  .  8  j 


where  the  argument  of  J  can  be  taken  to  be  It  f cl  leva  that 

the  efficacy  of  W?  ia  [TJ22<> ,9e> f 1 


(c)  W,\ 

Ue  ahall  ahew  that  X*  -  V  -  0  (I  **>  after  which  the  are-ra-rt  ’:*ed 
P 

for  Wj  can  be  reproduced  without  change.  The  key  reason  la  that 

X  -  1  appears  In  the  first  but  not  in  the  second  roepontnt  ot  «•  in  (A?.*), 

The  fact  that  X*  -  x  -  0  (T  S  follows  directly  froo  the  expression 
P 

for  X*.  A  wore  general  approach.  not  confined  to  this  particular 
problem,  follows 


0  -  Y,(».,9  )  -  Y.(X.*) 
l*o  1 


aYjO.A) 

»♦ 


<x. *>(*,-  e  ) 


?Y 

Tt 


-  JO.«)U  ♦  op(l)] 


(A2.8) 


I 


1 


I 


Sot*  that  If  and  arc  close  to  £ 


«/  •  V,  icp^>  -  9™ut  . 


If  2(i)  -  p (BjX  9v92  cloae  to  ?Q.  Pt(J>  -  pj2)  *  <»A  -  »2)«,l(e0> 


31ogp  (t  ) 


A.4 Two maximization problems.

(a) The maximum value of $(a'\delta)^2/(a'Ba)$ is obtained by minimizing
$a'Ba$ subject to $a'\delta = K$. Applying the method of Lagrange
multipliers, the minimizing $a$ is proportional to $B^{-1}\delta$ and the
maximum value is $\delta' B^{-1} \delta$, provided $B$ is nonsingular.
However, in our application $Bv = 0$ and $\delta'v = 0$. If we replace $B$ by
$F = B + vv'$ and $a$ by $a + hv$ with $a'v = 0$, then

$(a + hv)'\delta = a'\delta$ and $(a + hv)'F(a + hv) = a'Ba + h^2 (v'v)^2$.

It is clear that the minimizing $a + hv$ for the new problem coincides with
that of the original problem. Thus the value of $a$ is $F^{-1}\delta$ and the
maximum value of $(a'\delta)^2/(a'Ba)$ is $\delta' F^{-1} \delta$, provided
that $F$ is nonsingular.
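The device of replacing a singular $B$ by $F = B + vv'$ can be checked numerically on a small example. The sketch below is illustrative only: the matrix, the vectors and the helper names are hypothetical, and a plain Gaussian elimination stands in for a linear algebra library.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def quad(x, B, y):
    """The bilinear form x' B y."""
    return sum(x[i] * B[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

# Hypothetical singular B with B v = 0 and delta' v = 0.
v = [1.0, 1.0, 1.0]
B = [[2.0, -1.0, -1.0], [-1.0, 2.0, -1.0], [-1.0, -1.0, 2.0]]
delta = [1.0, 0.0, -1.0]

F = [[B[i][j] + v[i] * v[j] for j in range(3)] for i in range(3)]
a = solve(F, delta)                         # a = F^{-1} delta
best = dot(a, delta) ** 2 / quad(a, B, a)   # equals delta' F^{-1} delta
print(a, best)
```

In this example $F$ is three times the identity, so $a = \delta/3$ and the maximum is $2/3$; any other coefficient vector gives a ratio no larger than this.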
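(b) To minimize $\sum p_i u_i^{-1} q_i^2$ subject to
$\sum p_i \delta_i q_i = K$, where $q_i = \sum_j a_{ij} p_j$, under the
condition that $(a_{ij})$ be skew symmetric. The condition of skew symmetry
implies $\sum p_i q_i = 0$. Moreover, given any vector $q$ for which
$p'q = 0$, there is a skew symmetric $A$ for which $q = Ap$. Hence, for our
minimization problem, skew symmetry is equivalent to $p'q = 0$, and we may
apply the method of Lagrange multipliers to minimize with respect to $q$.

$u_i^{-1} q_i = \nu_1 \delta_i + \nu_2$

$q_i = \nu_1 \delta_i u_i + \nu_2 u_i$

The restrictions yield

$\nu_1 \sum p_i \delta_i u_i + \nu_2 \sum p_i u_i = 0$

$\nu_1 \sum p_i \delta_i^2 u_i + \nu_2 \sum p_i \delta_i u_i = K$

$\nu_2 = -\nu_1 E(\delta u)/E(u)$

$\nu_1 = \dfrac{K E(u)}{E(u) E(\delta^2 u) - [E(\delta u)]^2}$

where $E(g)$ denotes $\sum p_i g_i$, so that the minimum value is
$\nu_1 K = K^2 E(u) \big/ \{ E(u) E(\delta^2 u) - [E(\delta u)]^2 \}$.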
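The Lagrange solution can be verified numerically. The following sketch uses hypothetical names and values; it computes $q_i = u_i(\nu_1\delta_i + \nu_2)$ from the two constraints $\sum p_i q_i = 0$ and $\sum p_i \delta_i q_i = K$, and compares the objective $\sum p_i q_i^2/u_i$ with its value at a nearby feasible point.

```python
def expect(p, g):
    """Weighted sum, written E(g) = sum of p_i * g_i."""
    return sum(pi * gi for pi, gi in zip(p, g))

def optimal_q(p, delta, u, K=1.0):
    """Minimize sum p_i q_i^2 / u_i subject to
    sum p_i q_i = 0 and sum p_i delta_i q_i = K."""
    Eu = expect(p, u)
    Edu = expect(p, [d * w for d, w in zip(delta, u)])
    Ed2u = expect(p, [d * d * w for d, w in zip(delta, u)])
    nu1 = K * Eu / (Eu * Ed2u - Edu ** 2)
    nu2 = -nu1 * Edu / Eu
    q = [w * (nu1 * d + nu2) for d, w in zip(delta, u)]
    return q, nu1, nu2

def objective(p, u, q):
    return sum(pi * qi * qi / ui for pi, qi, ui in zip(p, q, u))

# Hypothetical illustrative values.
p = [0.2, 0.3, 0.5]
delta = [-1.0, -1.0, 1.0]
u = [1.0, 2.0, 4.0]
q, nu1, nu2 = optimal_q(p, delta, u, K=1.0)
print(q, objective(p, u, q))
```

For these values the solution is $q = (-0.625, -1.25, 1)$ and the minimum equals $\nu_1 K = 0.4375$.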